Rejection sampling

In mathematics, rejection sampling is a basic pseudo-random number sampling technique used to generate observations from a distribution. It is also commonly called the acceptance-rejection method or "accept-reject algorithm".

It generates sampling values from an arbitrary probability distribution function $f(x)$ by using an instrumental distribution $g(x)$ , under the only restriction that $f(x)< M g(x)$ where $M>1$ is an appropriate bound on $f(x)/g(x)$ .

Rejection sampling is usually used in cases where the form of $f(x)$ makes sampling difficult. Instead of sampling directly from the distribution $f(x)$ , we use an envelope distribution $Mg(x)$ where sampling is easier. These samples from $Mg(x)$ are probabilistically accepted or rejected.

This method relates to the general field of Monte Carlo techniques, including Markov chain Monte Carlo algorithms that also use a proxy distribution to achieve simulation from the target distribution $f(x)$ . It forms the basis for algorithms such as the Metropolis algorithm.

The unconditional acceptance probability is the proportion of proposed samples which are accepted, which is $Pr(u<f(x)/Mg(x))$ $= E[f(x)/Mg(x)] =$ ∫ $(f(x)/Mg(x))g(x)dx = (1/M)$ ∫ $f(x)dx = 1/M$ . If $M$ is low, fewer samples are rejected, and the required number of samples for the target distribution is obtained more quickly. Because $M$ must be no less than the maximum of $f(x)/g(x)$ , the unconditional acceptance probability is higher the less that ratio varies, however to obtain acceptance probability 1, $f(x)=g(x)$ , which defeats the purpose of sampling.

1 Algorithm
2 Examples
3 Drawbacks
4 See also
5 References

Algorithm

The algorithm (used by John von Neumann and dating back to Buffon and his needle) is as follows:

Sample $x$ from $g(x)$ and $u$ from $U(0,1)$ (the uniform distribution over the unit interval)
Check whether or not .
- If this holds, accept $x$ as a realization of $f(x)$ ;
- if not, reject the value of $x$ and repeat the sampling step.

The validation of this method is the envelope principle: when simulating the pair $(x,v=u*Mg(x))$ , one produces a uniform simulation over the subgraph of $Mg(x)$ . Accepting only pairs such that $u<f(x)/Mg(x)$ then produces pairs $(x,v)$ uniformly distributed over the subgraph of $f(x)$ and thus, marginally, a simulation from $f(x)$ .

This means that, with enough replicates, the algorithm generates a sample from the desired distribution $f(x)$ . There are a number of extensions to this algorithm, such as the Metropolis algorithm.

Examples

As a simple geometric example, suppose it is desired to generate a random point within the unit circle. Generate a candidate point $(x,y)$ where $x$ and $y$ are independent uniformly distributed between −1 and 1. If it so happens that $x^2%2By^2 \leq 1$ then the point is within the unit circle and should be accepted. If not then this point should be rejected and another candidate should be generated.

The ziggurat algorithm, a more advanced example, is used to efficiently generate normally-distributed pseudorandom numbers.

Drawbacks

Rejection sampling can lead to a lot of unwanted samples being taken if the function being sampled is highly concentrated in a certain region, for example a function that has a spike at some location. Also, as the dimensions of the problem get larger, the ratio of the embedded volume to the "corners" of the embedding volume tends towards zero, thus a lot of rejections can take place before a useful sample is generated, thus making the algorithm inefficient and impractical. See curse of dimensionality.

References

Robert, C.P. and Casella, G. "Monte Carlo Statistical Methods" (second edition). New York: Springer-Verlag, 2004.
J. von Neumann, "Various techniques used in connection with random digits. Monte Carlo methods", Nat. Bureau Standards, 12 (1951), pp. 36–38.

Rejection sampling

Contents

Algorithm

Examples

Drawbacks

See also

References